Wanderlust: Extracting Semantic Relations from Natural Language Text Using Dependency Grammar Patterns
Abstract
Many applications in modern information technology can benefit from large-coverage, machine-accessible knowledge bases. However, most of today's knowledge is available only as unstructured data, mostly plain text. As an initial step toward exploiting such data, we present Wanderlust, an algorithm that automatically extracts semantic relations from natural language text. The procedure uses deep linguistic patterns that are defined over the dependency grammar of sentences. Because of its linguistic nature, the method operates in an unsupervised fashion and is not restricted to any specific type of semantic relation. The applicability of the proposed approach is examined in a case study in which it is used to generate a semantic wiki from the English Wikipedia corpus. We discuss in detail the insights obtained from this case study, including considerations about the generality of the approach.
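To make the idea of a pattern defined over a dependency parse concrete, here is a minimal sketch. It is not the Wanderlust pattern set, which the abstract does not list: the single subject-verb-object pattern, the `Token` type, the `extract_triples` helper, and the hand-written parse are all hypothetical, and the labels `nsubj` and `dobj` are assumed to follow Stanford-style typed dependencies.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """One word of a dependency-parsed sentence (hypothetical structure)."""
    index: int   # position of the token in the sentence
    text: str    # surface form
    head: int    # index of the governing token, -1 for the root
    deprel: str  # label of the dependency linking this token to its head


def extract_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Apply one illustrative pattern: a token governing both an nsubj
    and a dobj dependent yields a (subject, predicate, object) triple."""
    triples = []
    for governor in tokens:
        subjects = [t for t in tokens
                    if t.head == governor.index and t.deprel == "nsubj"]
        objects = [t for t in tokens
                   if t.head == governor.index and t.deprel == "dobj"]
        for subj in subjects:
            for obj in objects:
                triples.append((subj.text, governor.text, obj.text))
    return triples


# Hand-written parse of "Wanderlust extracts semantic relations".
# A real system would obtain this from a dependency parser.
SENTENCE = [
    Token(0, "Wanderlust", 1, "nsubj"),
    Token(1, "extracts", -1, "root"),
    Token(2, "semantic", 3, "amod"),
    Token(3, "relations", 1, "dobj"),
]

print(extract_triples(SENTENCE))
```

Because the pattern refers only to grammatical roles and not to particular words, the same rule applies to any verb, which is one way to read the abstract's claim that the method is not tied to a specific relation type.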
Similar resources
Extracting Noun Phrases in Subject and Object Roles for Exploring Text Semantics
In tune with the recent developments in the automatic retrieval of text semantics, this paper is an attempt to extract one of the most fundamental semantic units from natural language text. The context is intuitively extracted from typed dependency structures basically depicting dependency relations instead of Part-Of-Speech tagged representation of the text. The dependency relations imply deep...
Semi-Supervised Convolution Graph Kernels for Relation Extraction
Extracting semantic relations between entities is an important step towards automatic text understanding. In this paper, we propose a novel Semi-supervised Convolution Graph Kernel (SCGK) method for semantic Relation Extraction (RE) from natural English text. By encoding sentences as dependency graphs of words, SCGK computes kernels (similarities) between sentences using a convolution strategy,...
Review: Realization with CCG
Here I give an overview of recent work on natural language realization with Combinatory Categorial Grammar, done by Michael White and his colleagues, with some more specific descriptions of the algorithms used, where they were unclear to me. In particular, I focus on the work presented in his 2007 paper [5], in which White et al. describe a process for extracting a grammar from the CCGBank and u...
Extracting Semantic Relations Using Dependency Paths
Lexical semantic relations are useful in a variety of natural language applications, but to collect, update and maintain them by hand is tedious and costly. We experiment with the use of dependency paths, rather than regular expressions, as the formalism for representing patterns that may be indicative of a semantic relation between noun pairs. Our corpus is Wikipedia article abstracts, and we ...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the increasing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...